FOSS-Based Grid Computing

Author

  • A. Mani
Abstract

In this expository article we will be primarily concerned with core aspects of Grids and Grid computing using OSS, with some emphasis on utility computing. It is based on a technical report entitled 'Grid-Computing Using Linux' by the present author.

Grids have made great progress in scientific computing and collaboration projects in the recent past, and have more recently moved into the domain of business computing. Linux and Open Source Software (OSS), on the other hand, have steadily progressed into every sphere of computing. The future holds a great deal more for all three. Progress in Grids will largely accrue to Linux, as Grids are made for the *nixes.

Foster and Kesselman [B2] define a computational Grid "as a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities. Grid computing is concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The key concept is the ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose."

A Grid must:
➢ Coordinate decentralized resources through a decentralized shared mechanism. The users and resources will be in different domains and almost all control must be shared.
➢ Use standard, open, general-purpose protocols and interfaces.
➢ Deliver nontrivial quality of service.

A Grid is built from multi-purpose protocols and interfaces that deal, in particular, with resource discovery, authentication, authorization and resource access. These protocols and interfaces must be standard and open; otherwise the system is merely an application-specific system.
The above criteria do leave some room for debate on what a grid actually is (see Bhuyya et al. [G1]). It is accepted, however, that systems like Sun’s Sun Grid Engine, Platform’s Load Sharing Facility, or Veridian’s Portable Batch System are not grids, as they involve centralized control of hosts and have complete control over user requests and allocation. The three criteria apply most clearly to the various large-scale Grid deployments used within the scientific community. These include distributed data processing systems like GriPhyN, PPDG, EU DataGrid, iVDGL, DataTAG and the TeraGrid. These systems integrate resources from multiple institutions even though each institution uses its own policies and mechanisms. They use open, general-purpose (Globus Toolkit) protocols to negotiate and manage sharing, security, reliability, and performance.

The next generation of IT evolution is bound to involve 'Utility Computing' in a big way. Its design is based on a service provisioning model, in which users or consumers pay providers for computing power only when they need it. The main benefits of the utility computing model for service providers are:
➢ The computing service provider need not set up actual hardware and software components to satisfy a single solution or user, as in the case of traditional computing.
➢ Providers can reallocate resources with ease by using virtualized resources, which can be created and assigned dynamically to various users when needed.
➢ The operational costs for providers are reduced due to better resource utilization. The total cost of ownership (TCO) is also reduced.

The design aims and benefits of grids are naturally suited to utility computing environments. The interoperability of grids is substantially enhanced by the use of open standard service-based architectures, which makes grids all the more suitable for utility computing.

Grid applications have been, and still are, used mostly in scientific research and collaboration projects.
In recent times there has also been a great increase in the number of Grid applications in business and industry-related projects. Grids provide the following benefits:
➢ Access to extra resources needed for solving problems that were previously unsolvable due to lack of resources.
➢ Transparent and instantaneous access to geographically distributed resources of a heterogeneous nature (including hardware and software).
➢ Improved productivity with reduced processing time.
➢ The infrastructure for aggregation of resources from multiple sites to meet sudden demands.
➢ Infrastructure for utilizing underutilized or unused computing resources that are otherwise wasted.
➢ Optimal utilization of computing facilities to justify IT capital investments.
➢ Infrastructure for coordinated resource sharing and problem solving through virtual organizations that facilitate interdepartmental and inter-organizational collaboration.
➢ The gross effort needed for administration is reduced in comparison to managing multiple standalone systems.

Linux and Grid Computing

Grid computing by definition must be based on open source software and operating systems. Even if we choose to relax the primary criteria, commercial closed source operating systems are hindered in the grid computing sphere by a wide variety of problems, including flexibility, security, integration with applications, lack of tools and scalability, among others. Among the different open source operating systems available, Linux is the best suited to the grid computing arena. Some of the reasons for this are as follows:
➢ The Open Grid Services Architecture (OGSA) and the Globus toolkit have been built in compliance with open source standards and licenses. The latter deals with the issues of information discovery, security, portability, resource management, data management, communication and error analysis in the Grid context. Their development has followed the same open line of development as Linux.
The long-term success of grid computing depends on free and open standards, open software, open infrastructure and the development of grid services for business purposes. Linux is an open source operating system that has progressed along similar lines and is perfectly compatible with the maintainability of grid computing standards and with the orientation towards the development of Grid-related services.
➢ Almost all of the computational grid networks developed in the scientific and computing departments of universities and laboratories have been built on Linux or Unix. The available empirical evidence says that Linux is the best available operating system for grid computing. Few have dared to risk so many resources on operating systems like Windows NT or Windows XP.
➢ Many of the benefits of grids for business purposes require empirical confirmation, as the associated optimization problems are not sufficiently tractable. This means that such grids would be developed through smaller increments of resources. The open nature of Linux, along with its rock-solid stability, can sustain such a development scenario even in the face of low capital investment.
➢ Almost all of the major commercial forays into grid computing have been Linux or Unix centric. Linux is central to IBM's Grid strategy. Sun has released a Linux-specific version of its Grid software (5.3+). Oracle's 10g package is Grid enabled and runs on Linux.
➢ The introduction of closed source software into massive Grids is bound to generate enough distrust and security concerns to induce a massive waste of resources.

Grid concepts and components

In this section, we consider different grid concepts and explain the associated terminology in detail.

Types of resources

A grid is a collection of machines, sometimes referred to as “nodes,” “resources,” “members,” “donors,” “clients,” “hosts,” “engines,” and many other such terms. They all contribute any combination of resources to the grid as a whole.
Some resources may be used by all users of the grid, while others may have specific restrictions.

Computation

Computers in a grid can vary in CPU speed, architecture, software platform, and other associated factors, such as memory, storage, and connectivity. The computation resources of a grid can be used in the following three ways:
➢ By running an existing application on an available machine on the grid rather than locally.
➢ By running applications specifically capable of parallel computation.
➢ By running applications that need to be executed many times, on many different machines in the grid.

The “scalability” of a grid is a measure of how efficiently the multiple processors on a grid are used. If doubling the number of processors makes an application complete in half the time, then the grid is said to be perfectly scalable. Scalability is usually expressed as a percentage with respect to this ideal situation and is necessarily application specific.
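
The percentage described above can be illustrated with a small calculation. The Python fragment below is only a sketch and is not taken from the original report: the function name `scalability_percent`, its parameters, and the timings in the usage lines are hypothetical. It assumes that scalability is measured as the observed speedup between two runs divided by the ideal (linear) speedup implied by the change in processor count.

```python
def scalability_percent(t_base, n_base, t_scaled, n_scaled):
    """Observed speedup as a percentage of the ideal (linear) speedup.

    t_base, t_scaled: run times of the same application (same units).
    n_base, n_scaled: processor counts used for the two runs.
    All names here are illustrative assumptions.
    """
    if min(t_base, t_scaled) <= 0 or min(n_base, n_scaled) <= 0:
        raise ValueError("times and processor counts must be positive")
    observed_speedup = t_base / t_scaled   # how much faster the run became
    ideal_speedup = n_scaled / n_base      # how much faster it could ideally be
    return 100.0 * observed_speedup / ideal_speedup


# Doubling processors halves the run time: perfectly scalable.
print(scalability_percent(100.0, 4, 50.0, 8))   # 100.0

# Doubling processors only cuts the time from 100 to 64 units.
print(scalability_percent(100.0, 4, 64.0, 8))   # 78.125
```

Because the run times belong to one particular application, the resulting figure says nothing about how other applications would scale on the same grid.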




Publication date: 2006